170 research outputs found
Simplified Distributed Programming with Micro Objects
Developing large-scale distributed applications can be a daunting task.
object-based environments have attempted to alleviate problems by providing
distributed objects that look like local objects. We advocate that this
approach has actually only made matters worse, as the developer needs to be
aware of many intricate internal details in order to adequately handle partial
failures. The result is an increase of application complexity. We present an
alternative in which distribution transparency is lessened in favor of clearer
semantics. In particular, we argue that a developer should always be offered
the unambiguous semantics of local objects, and that distribution comes from
copying those objects to where they are needed. We claim that it is often
sufficient to provide only small, immutable objects, along with facilities to
group objects into clusters.Comment: In Proceedings FOCLASA 2010, arXiv:1007.499
Near-Optimal Straggler Mitigation for Distributed Gradient Methods
Modern learning algorithms use gradient descent updates to train inferential
models that best explain data. Scaling these approaches to massive data sizes
requires proper distributed gradient descent schemes where distributed worker
nodes compute partial gradients based on their partial and local data sets, and
send the results to a master node where all the computations are aggregated
into a full gradient and the learning model is updated. However, a major
performance bottleneck that arises is that some of the worker nodes may run
slow. These nodes a.k.a. stragglers can significantly slow down computation as
the slowest node may dictate the overall computational time. We propose a
distributed computing scheme, called Batched Coupon's Collector (BCC) to
alleviate the effect of stragglers in gradient methods. We prove that our BCC
scheme is robust to a near optimal number of random stragglers. We also
empirically demonstrate that our proposed BCC scheme reduces the run-time by up
to 85.4% over Amazon EC2 clusters when compared with other straggler mitigation
strategies. We also generalize the proposed BCC scheme to minimize the
completion time when implementing gradient descent-based algorithms over
heterogeneous worker nodes
A Compositional Semantics for Stochastic Reo Connectors
In this paper we present a compositional semantics for the channel-based
coordination language Reo which enables the analysis of quality of service
(QoS) properties of service compositions. For this purpose, we annotate Reo
channels with stochastic delay rates and explicitly model data-arrival rates at
the boundary of a connector, to capture its interaction with the services that
comprise its environment. We propose Stochastic Reo automata as an extension of
Reo automata, in order to compositionally derive a QoS-aware semantics for Reo.
We further present a translation of Stochastic Reo automata to Continuous-Time
Markov Chains (CTMCs). This translation enables us to use third-party CTMC
verification tools to do an end-to-end performance analysis of service
compositions.Comment: In Proceedings FOCLASA 2010, arXiv:1007.499
Lagrange Coded Computing: Optimal Design for Resiliency, Security and Privacy
We consider a scenario involving computations over a massive dataset stored
distributedly across multiple workers, which is at the core of distributed
learning algorithms. We propose Lagrange Coded Computing (LCC), a new framework
to simultaneously provide (1) resiliency against stragglers that may prolong
computations; (2) security against Byzantine (or malicious) workers that
deliberately modify the computation for their benefit; and (3)
(information-theoretic) privacy of the dataset amidst possible collusion of
workers. LCC, which leverages the well-known Lagrange polynomial to create
computation redundancy in a novel coded form across workers, can be applied to
any computation scenario in which the function of interest is an arbitrary
multivariate polynomial of the input dataset, hence covering many computations
of interest in machine learning. LCC significantly generalizes prior works to
go beyond linear computations. It also enables secure and private computing in
distributed settings, improving the computation and communication efficiency of
the state-of-the-art. Furthermore, we prove the optimality of LCC by showing
that it achieves the optimal tradeoff between resiliency, security, and
privacy, i.e., in terms of tolerating the maximum number of stragglers and
adversaries, and providing data privacy against the maximum number of colluding
workers. Finally, we show via experiments on Amazon EC2 that LCC speeds up the
conventional uncoded implementation of distributed least-squares linear
regression by up to , and also achieves a
- speedup over the state-of-the-art straggler
mitigation strategies
Robustness of Equations Under Operational Extensions
Sound behavioral equations on open terms may become unsound after
conservative extensions of the underlying operational semantics. Providing
criteria under which such equations are preserved is extremely useful; in
particular, it can avoid the need to repeat proofs when extending the specified
language.
This paper investigates preservation of sound equations for several notions
of bisimilarity on open terms: closed-instance (ci-)bisimilarity and
formal-hypothesis (fh-)bisimilarity, both due to Robert de Simone, and
hypothesis-preserving (hp-)bisimilarity, due to Arend Rensink. For both
fh-bisimilarity and hp-bisimilarity, we prove that arbitrary sound equations on
open terms are preserved by all disjoint extensions which do not add labels. We
also define slight variations of fh- and hp-bisimilarity such that all sound
equations are preserved by arbitrary disjoint extensions. Finally, we give two
sets of syntactic criteria (on equations, resp. operational extensions) and
prove each of them to be sufficient for preserving ci-bisimilarity.Comment: In Proceedings EXPRESS'10, arXiv:1011.601
- …